How to Rein in the Volatile Actor: A New Bounded Perspective
Abstract
Actor-critic algorithms are among the most well-studied reinforcement learning algorithms for solving Markov decision processes (MDPs) via simulation. Unfortunately, the parameters of the so-called “actor” in the classical actor-critic algorithm exhibit great volatility: they become unbounded, so they have to be artificially constrained to obtain solutions in practice. The algorithm is often used in conjunction with Boltzmann action selection, where one may have to use a temperature to get the algorithm to work, but convergence of the algorithm has only been proved when the temperature equals 1. We propose a new actor-critic algorithm whose actor’s parameters are bounded. We present a mathematical proof of the boundedness and test our algorithm on small-scale MDPs under the infinite-horizon discounted-reward criterion. Our algorithm produces encouraging numerical results.
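To make the role of the temperature concrete, the following is a minimal sketch of Boltzmann (softmax) action selection for a single state. It is not the algorithm proposed in the paper. The function name `boltzmann_probs`, the argument `actor_values`, and the sample numbers are all assumptions chosen for illustration.

```python
import math

def boltzmann_probs(actor_values, temperature=1.0):
    """Return Boltzmann (softmax) action probabilities for one state.

    actor_values: hypothetical actor parameters, one per action.
    temperature:  positive scalar; 1.0 is the untempered case.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [v / temperature for v in actor_values]
    shift = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Same actor values, two temperatures: a higher temperature flattens the policy.
probs_t1 = boltzmann_probs([2.0, 0.0], temperature=1.0)
probs_t10 = boltzmann_probs([2.0, 0.0], temperature=10.0)

# A very large actor value makes the policy effectively deterministic.
probs_large = boltzmann_probs([1000.0, 0.0], temperature=1.0)
```

In this sketch, dividing by a temperature greater than 1 shrinks the gap between actor values before exponentiation, which is one way to keep large parameters from collapsing the policy onto a single action.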
Similar resources
The Symbiosis of Human and Semantic Technology Through the Lens of Actor-Network Theory
Background: Semantic technologies (STs) have made machine reasoning possible by providing intelligent data management methods. This capability has created new forms of interaction between humans and STs, which is called "semantic interaction." The increasing spread of this form of interaction in daily life reveals the need to identify the factors affecting it and introduce the requirements of...
Deconstruction of Language and Expression in Kiarostami’s Cinema: A case study on “Shirin”
This article aims to study the significant language and expression methods of Abbas Kiarostami’s cinema by analyzing the context and structure of a movie titled Shirin, focusing on its narrative and internal elements in a deconstructive manner. The movie is a masterpiece in which life’s passion is intermingled with death, nothingness, and despair. Analyzing the movie Shirin is an attempt to red...
Explaining the Role of Management Accounting Information System in Strategy Formulation with Actors Network Approach
The real challenge of business environment is derived from a situation where organizations need to find opportunities on how to introduce ideas and new products to market that provide future earnings stream. Management accounting is used as a tool in this process and provides information on opportunities and threats. The purpose of this research is to explain the role of management accounting i...
Homomorphisms on Topological Groups from the Perspective of Bourbaki-boundedness
In this note we study some topological properties of bounded sets and Bourbaki-bounded sets. We also introduce two types of Bourbaki-bounded homomorphisms on topological groups, namely n-Bourbaki-bounded homomorphisms and B-Bourbaki-bounded homomorphisms. We compare them to each other and with the class of continuous homomorphisms. So, two topologies are presented on them a...
Conceptual Agent based Modeling in Supply Chain: An Economic Perspective
Abstract: Government legislation, social responsibility, and environmental concerns regarding the reduction of waste, hazardous material, and other consumer residuals have made competition between firms stricter than ever, and firms that want to survive now need a more productive and innovative approach toward the financial aspects of their businesses. This pape...